{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ur8xi4C7S06n"
      },
      "outputs": [],
      "source": [
        "# Copyright 2024 Google LLC\n",
        "#\n",
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "#     https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JAPoU8Sm5E6e"
      },
      "source": [
        "# Using OpenAI libraries with Gemini on Vertex AI\n",
        "\n",
        "<table align=\"left\">\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\">\n",
        "      <img width=\"32px\" src=\"https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
        "    </a>\n",
        "  </td>\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fchat-completions%2Fintro_chat_completions_api.ipynb\">\n",
        "      <img width=\"32px\" src=\"https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
        "    </a>\n",
        "  </td>    \n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/chat-completions/intro_chat_completions_api.ipynb\">\n",
        "      <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n",
        "    </a>\n",
        "  </td>\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\">\n",
        "      <img width=\"32px\" src=\"https://raw.githubusercontent.com/primer/octicons/refs/heads/main/icons/mark-github-24.svg\" alt=\"GitHub logo\"><br> View on GitHub\n",
        "    </a>\n",
        "  </td>\n",
        "  <td style=\"text-align: center\">\n",
        "    <a href=\"https://goo.gle/4jeQztq\">\n",
        "      <img width=\"32px\" src=\"https://cdn.qwiklabs.com/assets/gcp_cloud-e3a77215f0b8bfa9b3f611c0d2208c7e8708ed31.svg\" alt=\"Google Cloud logo\"><br> Open in  Cloud Skills Boost\n",
        "    </a>\n",
        "  </td>\n",
        "</table>\n",
        "\n",
        "<div style=\"clear: both;\"></div>\n",
        "\n",
        "<b>Share to:</b>\n",
        "\n",
        "<a href=\"https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg\" alt=\"LinkedIn logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg\" alt=\"Bluesky logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg\" alt=\"X logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png\" alt=\"Reddit logo\">\n",
        "</a>\n",
        "\n",
        "<a href=\"https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/chat-completions/intro_chat_completions_api.ipynb\" target=\"_blank\">\n",
        "  <img width=\"20px\" src=\"https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg\" alt=\"Facebook logo\">\n",
        "</a>            "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "84f0f73a0f76"
      },
      "source": [
        "| Author |\n",
        "| --- |\n",
        "| [Eric Dong](https://github.com/gericdong) |"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tvgnzT1CKxrO"
      },
      "source": [
        "## Overview\n",
        "\n",
        "Developers already working with OpenAI's libraries can easily tap into the power of Gemini by leveraging the Chat Completions API. The Chat Completions API offers a streamlined way to experiment with and incorporate Gemini's capabilities into your existing AI applications.\n",
        "\n",
        "If you are not already using the OpenAI libraries, we recommend using the Google Gen AI SDK. Learn more about [using OpenAI libraries with Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/migrate/openai/overview).\n",
        "\n",
        "In this tutorial, you learn to call Gemini using the OpenAI library. You will complete the following tasks:\n",
        "\n",
        "- Configure OpenAI SDK for the Chat Completions API\n",
        "- Send a chat completions request\n",
        "- Stream chat completions response\n",
        "- Send a multimodal request\n",
        "- Send a function calling request\n",
        "- Send a function calling request with the `tool_choice` parameter\n",
        "- Use controlled generation\n",
        "- Control thinking budget\n",
        "- Set safety settings\n",
        "- Use context caching\n",
        "- Use thought signature"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "61RBz8LLbxCR"
      },
      "source": [
        "## Get started"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "No17Cw5hgx12"
      },
      "source": [
        "### Install required packages\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "tFy3H3aPgx12"
      },
      "outputs": [],
      "source": [
        "%pip install --upgrade --quiet openai google-auth google-genai requests"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dmWOrTJ3gx13"
      },
      "source": [
        "### Authenticate your notebook environment (Colab only)\n",
        "\n",
        "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "NyKGtVQjgx13"
      },
      "outputs": [],
      "source": [
        "import sys\n",
        "\n",
        "if \"google.colab\" in sys.modules:\n",
        "    from google.colab import auth\n",
        "\n",
        "    auth.authenticate_user()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DF4l8DTdWgPY"
      },
      "source": [
        "### Set Google Cloud project information and initialize Vertex AI SDK\n",
        "\n",
        "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
        "\n",
        "Learn more about [setting up a project and development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Nqwi-5ufWp_B"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "\n",
        "# fmt: off\n",
        "PROJECT_ID = \"[your-project-id]\"  # @param {type: \"string\", placeholder: \"[your-project-id]\", isTemplate: true}\n",
        "# fmt: on\n",
        "if not PROJECT_ID or PROJECT_ID == \"[your-project-id]\":\n",
        "    PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
        "\n",
        "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"global\")\n",
        "\n",
        "print(f\"Using Vertex AI with project: {PROJECT_ID} in location: {LOCATION}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EdvJRUWRNGHE"
      },
      "source": [
        "## Chat Completions API Examples"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vHwJCyNF6u0O"
      },
      "source": [
        "### Configure OpenAI SDK for the Chat Completions API\n",
        "\n",
        "#### Import libraries\n",
        "\n",
        "The `google-auth` library is used to programmatically get Google credentials. Colab already has this library pre-installed."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "kolJkBLdHaw0"
      },
      "outputs": [],
      "source": [
        "import openai\n",
        "from IPython.display import Markdown, display\n",
        "from google.auth import default\n",
        "from google.auth.transport.requests import Request\n",
        "from google.genai.types import Content, CreateCachedContentConfig, Part"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "W0K6VSJRHhH2"
      },
      "source": [
        "#### Authentication\n",
        "\n",
        " Request an access token for authentication to Google APIs, Google Cloud services, and customer-created services hosted on Google Cloud. Note that the access token lives for [1 hour by default](https://cloud.google.com/docs/authentication/token-types#at-lifetime); after expiration, it must be refreshed.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "i0qceuiQEPHv"
      },
      "outputs": [],
      "source": [
        "credentials, _ = default(scopes=[\"https://www.googleapis.com/auth/cloud-platform\"])\n",
        "credentials.refresh(Request())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Q04wJmA0HT6X"
      },
      "source": [
        "Then configure the OpenAI SDK to point to the Chat Completions API endpoint."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c-MRhsnlj6iw"
      },
      "outputs": [],
      "source": [
        "api_host = \"aiplatform.googleapis.com\"\n",
        "if LOCATION != \"global\":\n",
        "    api_host = f\"{LOCATION}-aiplatform.googleapis.com\"\n",
        "\n",
        "client = openai.OpenAI(\n",
        "    base_url=f\"https://{api_host}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi\",\n",
        "    api_key=credentials.token,\n",
        ")"
      ]
    },
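    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b8f1c2d3e4a5"
      },
      "source": [
        "Because the access token expires after about an hour, long-running sessions need to refresh it and update the client. Below is a minimal sketch; the helper name `refresh_client_token` is ours, not part of either SDK."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c9a2d3e4f5b6"
      },
      "outputs": [],
      "source": [
        "def refresh_client_token() -> None:\n",
        "    \"\"\"Refresh the Google credentials and update the OpenAI client's API key.\"\"\"\n",
        "    credentials.refresh(Request())\n",
        "    client.api_key = credentials.token\n",
        "\n",
        "\n",
        "# Call this before sending requests if the token may have expired.\n",
        "refresh_client_token()"
      ]
    },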
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UGokrtdiIHrX"
      },
      "source": [
        "### Supported models\n",
        "\n",
        "The Chat Completions API supports both Gemini models and select self-deployed models from Model Garden. Learn more about [supported models](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#supported_models)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "r7OhyH46H2H5"
      },
      "outputs": [],
      "source": [
        "MODEL_ID = \"google/gemini-2.5-flash\"  # @param {type:\"string\"}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uiWrF7U4Q5-E"
      },
      "source": [
        "### **Example**: Send a chat completions request\n",
        "\n",
        "The Chat Completions API takes a list of messages as input and returns a generated message as output. Although the message format is designed to make multi-turn conversations easy, it's just as useful for single-turn tasks without any conversation.\n",
        "\n",
        "In this example, you use the Chat Completions API to send a request to the Gemini model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "JQTWLCzoHyUA"
      },
      "outputs": [],
      "source": [
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID, messages=[{\"role\": \"user\", \"content\": \"Why is the sky blue?\"}]\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jLNZGyOVOr1C"
      },
      "source": [
        "An example Chat Completions API response looks as follows:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DdfWBfggOvtf"
      },
      "outputs": [],
      "source": [
        "print(response)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ipbzw5FmO5zX"
      },
      "source": [
        "The generated content can be extracted with:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LxpdxYCxH51u"
      },
      "outputs": [],
      "source": [
        "response.choices[0].message.content"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nSIVXw7bMr4F"
      },
      "source": [
        "You can use `Markdown` to display the formatted text."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zOwEvNkCMiMF"
      },
      "outputs": [],
      "source": [
        "Markdown(response.choices[0].message.content)"
      ]
    },
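    {
      "cell_type": "markdown",
      "metadata": {
        "id": "d1e2f3a4b5c6"
      },
      "source": [
        "The same `messages` list also carries multi-turn conversations: append the assistant's reply and the next user message, then call the API again. A minimal sketch:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "e2f3a4b5c6d7"
      },
      "outputs": [],
      "source": [
        "conversation = [{\"role\": \"user\", \"content\": \"Why is the sky blue?\"}]\n",
        "\n",
        "first = client.chat.completions.create(model=MODEL_ID, messages=conversation)\n",
        "\n",
        "# Carry the assistant's reply forward, then ask a follow-up question.\n",
        "conversation.append(\n",
        "    {\"role\": \"assistant\", \"content\": first.choices[0].message.content}\n",
        ")\n",
        "conversation.append({\"role\": \"user\", \"content\": \"Summarize that in one sentence.\"})\n",
        "\n",
        "second = client.chat.completions.create(model=MODEL_ID, messages=conversation)\n",
        "Markdown(second.choices[0].message.content)"
      ]
    },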
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DpHc6sC6PIcI"
      },
      "source": [
        "### **Example**: Stream chat completions response\n",
        "\n",
        "By default, the model returns a response after completing the entire generation process. You can also stream the response as it is being generated, and the model will return chunks of the response as soon as they are generated."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3hO8yLqnJEgh"
      },
      "outputs": [],
      "source": [
        "output_text = \"\"\n",
        "markdown_display_area = display(Markdown(output_text), display_id=True)\n",
        "\n",
        "for chunk in client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[{\"role\": \"user\", \"content\": \"Why is the sky blue?\"}],\n",
        "    stream=True,\n",
        "):\n",
        "    output_text += chunk.choices[0].delta.content\n",
        "    markdown_display_area.update(Markdown(output_text))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kqNdZLK8VLaf"
      },
      "source": [
        "### **Example**: Send a multimodal request\n",
        "You can send a multimodal prompt in a request to Gemini and get a text output.\n",
        "\n",
        "In this example, you ask the model to create a blog post for [this image](https://storage.googleapis.com/cloud-samples-data/generative-ai/image/meal.png) stored in a Cloud Storage bucket.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "KOIu1Me-PQ6e"
      },
      "outputs": [],
      "source": [
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": [\n",
        "                {\n",
        "                    \"type\": \"text\",\n",
        "                    \"text\": \"Write a short, engaging blog post based on this picture.\",\n",
        "                },\n",
        "                {\n",
        "                    \"type\": \"image_url\",\n",
        "                    \"image_url\": \"gs://cloud-samples-data/generative-ai/image/meal.png\",\n",
        "                },\n",
        "            ],\n",
        "        }\n",
        "    ],\n",
        ")\n",
        "\n",
        "Markdown(response.choices[0].message.content)"
      ]
    },
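    {
      "cell_type": "markdown",
      "metadata": {
        "id": "f3a4b5c6d7e8"
      },
      "source": [
        "Besides Cloud Storage URIs, you can embed an image inline as a base64 data URL using the standard OpenAI message format. The sketch below assumes the endpoint accepts inline data URLs the same way the OpenAI API does; it downloads the same sample image with `requests` and re-encodes it."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a4b5c6d7e8f9"
      },
      "outputs": [],
      "source": [
        "import base64\n",
        "\n",
        "import requests\n",
        "\n",
        "# Download the sample image and base64-encode it (a local file works the same way).\n",
        "image_bytes = requests.get(\n",
        "    \"https://storage.googleapis.com/cloud-samples-data/generative-ai/image/meal.png\"\n",
        ").content\n",
        "encoded_image = base64.b64encode(image_bytes).decode(\"utf-8\")\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": [\n",
        "                {\"type\": \"text\", \"text\": \"Describe this meal in one sentence.\"},\n",
        "                {\n",
        "                    \"type\": \"image_url\",\n",
        "                    \"image_url\": {\"url\": f\"data:image/png;base64,{encoded_image}\"},\n",
        "                },\n",
        "            ],\n",
        "        }\n",
        "    ],\n",
        ")\n",
        "\n",
        "Markdown(response.choices[0].message.content)"
      ]
    },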
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JPJot5D_dE6G"
      },
      "source": [
        "### **Example**: Send a function calling request\n",
        "\n",
        "You can use the Chat Completions API for function calling with the Gemini models. The `tools` parameter in the API is used to provide function specifications. This is to enable models to generate function arguments which adhere to the provided specifications.\n",
        "\n",
        "In this example, you create function specifications to interface with a hypothetical weather API, then pass these function specifications to the Chat Completions API to generate function arguments that adhere to the specification.\n",
        "\n",
        "**Note** that in this example, the API will not actually execute any function calls. It is up to developers to execute function calls using model outputs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "arshz2C4W_ck"
      },
      "outputs": [],
      "source": [
        "tools = [\n",
        "    {\n",
        "        \"type\": \"function\",\n",
        "        \"function\": {\n",
        "            \"name\": \"get_current_weather\",\n",
        "            \"description\": \"Get the current weather in a given location\",\n",
        "            \"parameters\": {\n",
        "                \"type\": \"object\",\n",
        "                \"properties\": {\n",
        "                    \"location\": {\n",
        "                        \"type\": \"string\",\n",
        "                        \"description\": \"The city and state, e.g. San Francisco, CA or a zip code e.g. 95616\",\n",
        "                    },\n",
        "                },\n",
        "                \"required\": [\"location\"],\n",
        "            },\n",
        "        },\n",
        "    }\n",
        "]\n",
        "\n",
        "messages = [\n",
        "    {\n",
        "        \"role\": \"system\",\n",
        "        \"content\": \"Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.\",\n",
        "    },\n",
        "    {\"role\": \"user\", \"content\": \"What is the weather in Boston?\"},\n",
        "]\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=messages,\n",
        "    tools=tools,\n",
        ")\n",
        "\n",
        "print(response.choices[0].message.tool_calls)"
      ]
    },
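    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b5c6d7e8f9a0"
      },
      "source": [
        "To complete the loop, parse the generated arguments, execute your own implementation of the function, and send the result back in a `tool` message so the model can produce a final answer. A minimal sketch; the `get_current_weather` body below is a mock, not a real weather API."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c6d7e8f9a0b1"
      },
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "\n",
        "def get_current_weather(location: str) -> dict:\n",
        "    # Mock implementation; a real app would call a weather API here.\n",
        "    return {\"location\": location, \"temperature\": 18, \"unit\": \"celsius\"}\n",
        "\n",
        "\n",
        "tool_call = response.choices[0].message.tool_calls[0]\n",
        "result = get_current_weather(**json.loads(tool_call.function.arguments))\n",
        "\n",
        "# Append the assistant turn (with its tool call) and the tool result, then ask again.\n",
        "messages.append(response.choices[0].message)\n",
        "messages.append(\n",
        "    {\"role\": \"tool\", \"tool_call_id\": tool_call.id, \"content\": json.dumps(result)}\n",
        ")\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID, messages=messages, tools=tools\n",
        ")\n",
        "\n",
        "Markdown(response.choices[0].message.content)"
      ]
    },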
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "B82QRs9Oc1HL"
      },
      "source": [
        "### **Example**: Send a function calling request with the `tool_choice` parameter\n",
        "\n",
        "Using the `tools` parameter, if the functions parameter is provided then by default the model will decide when it is appropriate to use one of the functions.\n",
        "\n",
        "By default, `tool_choice` is set to `auto`. This lets the model decide whether to call functions and, if so, which functions to call. To disable function calling and force the model to only generate a user-facing message, you can set `tool_choice` to `none`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GoEv7CKNXZu1"
      },
      "outputs": [],
      "source": [
        "tools = [\n",
        "    {\n",
        "        \"type\": \"function\",\n",
        "        \"function\": {\n",
        "            \"name\": \"get_current_weather\",\n",
        "            \"description\": \"Get the current weather in a given location\",\n",
        "            \"parameters\": {\n",
        "                \"type\": \"object\",\n",
        "                \"properties\": {\n",
        "                    \"location\": {\n",
        "                        \"type\": \"string\",\n",
        "                        \"description\": \"The city and state, e.g. San Francisco, CA or a zip code e.g. 95616\",\n",
        "                    },\n",
        "                },\n",
        "                \"required\": [\"location\"],\n",
        "            },\n",
        "        },\n",
        "    }\n",
        "]\n",
        "\n",
        "messages = [\n",
        "    {\n",
        "        \"role\": \"system\",\n",
        "        \"content\": \"Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.\",\n",
        "    },\n",
        "    {\"role\": \"user\", \"content\": \"What is the weather in Boston?\"},\n",
        "]\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=messages,\n",
        "    tools=tools,\n",
        "    tool_choice=\"auto\",\n",
        ")\n",
        "\n",
        "print(response.choices[0].message.tool_calls)"
      ]
    },
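    {
      "cell_type": "markdown",
      "metadata": {
        "id": "d7e8f9a0b1c2"
      },
      "source": [
        "To force a plain text reply instead, set `tool_choice` to `none`; the model then skips function calling and answers directly."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "e8f9a0b1c2d3"
      },
      "outputs": [],
      "source": [
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=messages,\n",
        "    tools=tools,\n",
        "    tool_choice=\"none\",\n",
        ")\n",
        "\n",
        "print(response.choices[0].message.tool_calls)  # None: no function call generated\n",
        "print(response.choices[0].message.content)"
      ]
    },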
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0706c915f870"
      },
      "source": [
        "### **Example**: Use controlled generation\n",
        "\n",
        "The Gemini models allow you to define a response schema to specify the structure of a model's output, the field names, and the expected data type for each field. The response schema is specified in the `response_format` parameter, and the model output will strictly follow that schema."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "735d2eafeb2c"
      },
      "outputs": [],
      "source": [
        "from pydantic import BaseModel, RootModel\n",
        "\n",
        "class Recipe(BaseModel):\n",
        "    name: str\n",
        "    description: str\n",
        "    ingredients: list[str]\n",
        "\n",
        "\n",
        "class RecipeList(RootModel):\n",
        "    root: list[Recipe]\n",
        "\n",
        "    def __iter__(self):\n",
        "        return iter(self.root)\n",
        "\n",
        "    def __getitem__(self, item):\n",
        "        return self.root[item]\n",
        "\n",
        "\n",
        "response = client.beta.chat.completions.parse(\n",
        "    model=MODEL_ID,\n",
        "    messages=[\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": \"List a few popular cookie recipes and their ingredients.\",\n",
        "        }\n",
        "    ],\n",
        "    response_format=RecipeList,\n",
        ")\n",
        "\n",
        "print(response.choices[0].message)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3dd5a66689a2"
      },
      "outputs": [],
      "source": [
        "recipes: RecipeList = response.choices[0].message.parsed\n",
        "\n",
        "for recipe in recipes:\n",
        "    print(f\"Recipe: {recipe.name}\")\n",
        "    print(f\"Description: {recipe.description}\")\n",
        "    print(f\"Ingredients: {', '.join(recipe.ingredients)}\")\n",
        "    print()"
      ]
    },
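    {
      "cell_type": "markdown",
      "metadata": {
        "id": "f9a0b1c2d3e4"
      },
      "source": [
        "If you prefer not to define Pydantic models, `response_format` also accepts a raw JSON schema in OpenAI's `json_schema` format. A sketch, assuming the endpoint handles this format the same way `client.beta.chat.completions.parse` does; the schema name `recipe_list` is arbitrary."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a0b1c2d3e4f5"
      },
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": \"List a few popular cookie recipes and their ingredients.\",\n",
        "        }\n",
        "    ],\n",
        "    response_format={\n",
        "        \"type\": \"json_schema\",\n",
        "        \"json_schema\": {\n",
        "            \"name\": \"recipe_list\",\n",
        "            \"schema\": {\n",
        "                \"type\": \"object\",\n",
        "                \"properties\": {\n",
        "                    \"recipes\": {\n",
        "                        \"type\": \"array\",\n",
        "                        \"items\": {\n",
        "                            \"type\": \"object\",\n",
        "                            \"properties\": {\n",
        "                                \"name\": {\"type\": \"string\"},\n",
        "                                \"ingredients\": {\n",
        "                                    \"type\": \"array\",\n",
        "                                    \"items\": {\"type\": \"string\"},\n",
        "                                },\n",
        "                            },\n",
        "                            \"required\": [\"name\", \"ingredients\"],\n",
        "                        },\n",
        "                    }\n",
        "                },\n",
        "                \"required\": [\"recipes\"],\n",
        "            },\n",
        "        },\n",
        "    },\n",
        ")\n",
        "\n",
        "recipes = json.loads(response.choices[0].message.content)\n",
        "print(recipes[\"recipes\"][0][\"name\"])"
      ]
    },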
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3857c959d8f5"
      },
      "source": [
        "### **Example**: Control thinking budget\n",
        "\n",
        "Gemini 2.5 models are trained to think through complex problems, leading to significantly improved reasoning. The Gemini API comes with a [`thinking_budget`](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#budget) parameter which gives you control how much the model thinks during its responses.\n",
        "\n",
        "See the following mapping between the OpenAI API parameter `reasoning_effort` and the Gemini API parameter `thinking_budget`:\n",
        "\n",
        "| `reasoning_effort` | `thinking_budget` |\n",
        "|--------------------|-------------------|\n",
        "| `none`             | `0`               |\n",
        "| `low`              | `1024`            |\n",
        "| `medium`           | `8192`            |\n",
        "| `high`             | `24576`           |"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "98f16a7eb5c9"
      },
      "outputs": [],
      "source": [
        "prompt = \"\"\"\n",
        "Write a bash script that takes a matrix represented as a string with\n",
        "format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.\n",
        "\"\"\"\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    reasoning_effort=\"medium\",\n",
        "    messages=[{\"role\": \"user\", \"content\": prompt}],\n",
        ")\n",
        "\n",
        "Markdown(response.choices[0].message.content)"
      ]
    },
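    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b1c2d3e4f5a6"
      },
      "source": [
        "You can also set the Gemini `thinking_budget` directly in tokens instead of going through `reasoning_effort`, by passing it in `extra_body` (covered in the next section). A sketch, assuming a `thinking_budget` field is accepted inside `thinking_config`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c2d3e4f5a6b7"
      },
      "outputs": [],
      "source": [
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[{\"role\": \"user\", \"content\": prompt}],\n",
        "    extra_body={\n",
        "        \"extra_body\": {\n",
        "            \"google\": {\n",
        "                \"thinking_config\": {\n",
        "                    # Equivalent to reasoning_effort=\"medium\" per the table above.\n",
        "                    \"thinking_budget\": 8192,\n",
        "                },\n",
        "            },\n",
        "        },\n",
        "    },\n",
        ")\n",
        "\n",
        "Markdown(response.choices[0].message.content)"
      ]
    },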
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5866e07522f0"
      },
      "source": [
        "## Gemini-specific parameters\n",
        "\n",
        "There are several features supported by Gemini that are not available in OpenAI models. These features can still be passed in as parameters, but must be contained within an `extra_content` or `extra_body` or they will be ignored.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "e45f5f9f8c45"
      },
      "source": [
        "### **Example**: Set safety settings\n",
        "\n",
        "Gemini provides parameter `safety_settings` to set safety thresholds to filter responses from the model. Note that the response message may be `None` if `finish_reason` is `content_filter`, which indicates the generated content is blocked by the safety filters."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "c5f5f3a17ad7"
      },
      "outputs": [],
      "source": [
        "prompt = \"\"\"\n",
        "Write a list of 5 hateful things that I might say to the universe\n",
        "after stubbing my toe in the dark.\n",
        "\"\"\"\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[{\"role\": \"user\", \"content\": prompt}],\n",
        "    extra_body={\n",
        "        \"extra_body\": {\n",
        "            \"google\": {\n",
        "                \"safety_settings\": [\n",
        "                    {\n",
        "                        \"category\": \"HARM_CATEGORY_DANGEROUS_CONTENT\",\n",
        "                        \"threshold\": \"BLOCK_LOW_AND_ABOVE\",\n",
        "                    },\n",
        "                    {\n",
        "                        \"category\": \"HARM_CATEGORY_HATE_SPEECH\",\n",
        "                        \"threshold\": \"BLOCK_LOW_AND_ABOVE\",\n",
        "                    },\n",
        "                    {\n",
        "                        \"category\": \"HARM_CATEGORY_HARASSMENT\",\n",
        "                        \"threshold\": \"BLOCK_LOW_AND_ABOVE\",\n",
        "                    },\n",
        "                    {\n",
        "                        \"category\": \"HARM_CATEGORY_SEXUALLY_EXPLICIT\",\n",
        "                        \"threshold\": \"BLOCK_LOW_AND_ABOVE\",\n",
        "                    },\n",
        "                ]\n",
        "            }\n",
        "        }\n",
        "    },\n",
        ")\n",
        "\n",
        "print(response.choices[0].finish_reason)\n",
        "print(response.choices[0].message)\n",
        "print(response)"
      ]
    },
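    {
      "cell_type": "markdown",
      "metadata": {
        "id": "d3e4f5a6b7c8"
      },
      "source": [
        "Since the message content can be `None` when a response is blocked, guard on `finish_reason` before using it. A minimal sketch:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "e4f5a6b7c8d9"
      },
      "outputs": [],
      "source": [
        "choice = response.choices[0]\n",
        "\n",
        "if choice.finish_reason == \"content_filter\":\n",
        "    print(\"The response was blocked by the safety filters.\")\n",
        "else:\n",
        "    print(choice.message.content)"
      ]
    },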
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "72fa867af0df"
      },
      "source": [
        "### **Example**: Use context caching\n",
        "\n",
        "The Gemini provides the context caching feature for developers to store frequently used input tokens in a dedicated cache and reference them for subsequent requests, eliminating the need to repeatedly pass the same set of tokens to a model. This feature can help reduce the number of tokens sent to the model, thereby lowering the cost of requests that contain repeat content with high input token counts. Learn more about [context caching](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview).\n",
        "\n",
        "In this example, you create a context cache using the Google Gen AI SDK and then access the cached content using the OpenAI API."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "63d19bd4f32a"
      },
      "outputs": [],
      "source": [
        "from google import genai\n",
        "\n",
        "# Context cache is not available in global endpoint so we use a regional endpoint instead\n",
        "location = \"us-central1\"\n",
        "\n",
        "google_client = genai.Client(vertexai=True, project=PROJECT_ID, location=location)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4eb45f9887b8"
      },
      "source": [
        "Create a context cache using a large research paper stored in a Cloud Storage bucket. The default expiration time of a context cache is 60 minutes. You can specify a different expiration time using the `ttl` (time to live) or the `expire_time` property."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1e0d736259e5"
      },
      "outputs": [],
      "source": [
        "system_instruction = \"\"\"\n",
        "You are an expert researcher who has years of experience in conducting systematic\n",
        "literature surveys of different topics. You pride yourself on incredible accuracy\n",
        "and attention to detail. You always stick to the facts in the sources provided,\n",
        "and never make up new facts. Now look at the research paper below, and answer the\n",
        "following questions in 1-2 sentences.\n",
        "\"\"\"\n",
        "\n",
        "cached_content = google_client.caches.create(\n",
        "    model=MODEL_ID,\n",
        "    config=CreateCachedContentConfig(\n",
        "        contents=[\n",
        "            Content(\n",
        "                role=\"user\",\n",
        "                parts=[\n",
        "                    Part.from_uri(\n",
        "                        file_uri=\"gs://cloud-samples-data/generative-ai/pdf/1706.03762v7.pdf\",\n",
        "                        mime_type=\"application/pdf\",\n",
        "                    ),\n",
        "                ],\n",
        "            )\n",
        "        ],\n",
        "        system_instruction=system_instruction,\n",
        "        ttl=\"3600s\",\n",
        "    ),\n",
        ")\n",
        "\n",
        "print(cached_content.name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "021826c70412"
      },
      "source": [
        "Then configure the OpenAI SDK to point to the Chat Completions API endpoint where the cached content is created."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "90c735708374"
      },
      "outputs": [],
      "source": [
        "api_host = f\"{location}-aiplatform.googleapis.com\"\n",
        "\n",
        "openai_client = openai.OpenAI(\n",
        "    base_url=f\"https://{api_host}/v1/projects/{PROJECT_ID}/locations/{location}/endpoints/openapi\",\n",
        "    api_key=credentials.token,\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b98054e670a5"
      },
      "source": [
        "To access the context cache using OpenAI SDK, you provide the `cached_content` resource name in the `extra_body` parameter.\n",
        "\n",
        "Then you can query the model with a prompt, and the cached content will be used as a prefix to the prompt."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4ed75f9bd727"
      },
      "outputs": [],
      "source": [
        "response = openai_client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=[\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": \"Why did the authors choose sinusoidal over learned positional embeddings?\",\n",
        "        }\n",
        "    ],\n",
        "    extra_body={\n",
        "        \"extra_body\": {\n",
        "            \"google\": {\n",
        "                \"cached_content\": cached_content.name,\n",
        "            }\n",
        "        }\n",
        "    },\n",
        ")\n",
        "\n",
        "Markdown(response.choices[0].message.content)"
      ]
    },
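    {
      "cell_type": "markdown",
      "metadata": {
        "id": "f5a6b7c8d9e0"
      },
      "source": [
        "When you no longer need the cached content, delete it with the Gen AI SDK so you stop paying for cache storage before the TTL expires:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a6b7c8d9e0f1"
      },
      "outputs": [],
      "source": [
        "google_client.caches.delete(name=cached_content.name)"
      ]
    },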
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "a952c50e-8238-4221-a224-0918b75b65a3"
      },
      "source": [
        "### **Example**: Use thought signature\n",
        "\n",
        "A thought signature is an encrypted representation of the model's internal reasoning process for a given turn in a conversation. By passing this signature back to the model in subsequent requests, you provide it with the context of its previous thoughts, allowing it to build upon its reasoning and maintain a coherent line of inquiry.\n",
        "\n",
        "Thought signatures provide a powerful mechanism to maintain context in multi-turn conversations, enabling the model to tackle more complex, multi-step tasks that require reasoning and the use of external tools through function calling."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "e53c1c9e-094d-4548-935c-1a8451a52e10"
      },
      "source": [
        "#### Conditional Thermostat Control\n",
        "\n",
        "In this scenario, a user wants to set a thermostat based on the current weather. The request is: \"If it's too hot or too cold in London, set the thermostat to a comfortable temperature.\"\n",
        "\n",
        "This requires the model to:\n",
        "\n",
        "- Call a tool to get the weather in London.\n",
        "- Use the returned weather information to decide if another tool needs to be called.\n",
        "- Call the tool to set the thermostat if the condition is met."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3f1ab84c-e5a3-485a-929e-d30f688b6523"
      },
      "source": [
        "**Step 1**: Define Functions and Tools\n",
        "\n",
        "First, we define the two functions the model can use: `get_current_temperature` and `set_thermostat_temperature`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "092382b9-08b8-42df-8718-41357f1210e1"
      },
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "def get_current_temperature(location: str) -> dict:\n",
        "    \"\"\"Gets the current weather temperature for a given location.\"\"\"\n",
        "    print(f\"Tool Call: get_current_temperature(location={location})\")\n",
        "    # Mocking a real API call.\n",
        "    return {\"temperature\": 30, \"unit\": \"celsius\"}\n",
        "\n",
        "\n",
        "def set_thermostat_temperature(temperature: int) -> dict:\n",
        "    \"\"\"Sets the thermostat to a desired temperature.\"\"\"\n",
        "    print(f\"Tool Call: set_thermostat_temperature(temperature={temperature})\")\n",
        "    # In a real app, this would interact with a thermostat API.\n",
        "    return {\"status\": \"success\"}\n",
        "\n",
        "\n",
        "tools = [\n",
        "    {\n",
        "        \"type\": \"function\",\n",
        "        \"function\": {\n",
        "            \"name\": \"get_current_temperature\",\n",
        "            \"description\": \"Gets the current weather temperature for a given location.\",\n",
        "            \"parameters\": {\n",
        "                \"type\": \"object\",\n",
        "                \"properties\": {\"location\": {\"type\": \"string\"}},\n",
        "                \"required\": [\"location\"],\n",
        "            },\n",
        "        },\n",
        "    },\n",
        "    {\n",
        "        \"type\": \"function\",\n",
        "        \"function\": {\n",
        "            \"name\": \"set_thermostat_temperature\",\n",
        "            \"description\": \"Sets the thermostat to a desired temperature.\",\n",
        "            \"parameters\": {\n",
        "                \"type\": \"object\",\n",
        "                \"properties\": {\"temperature\": {\"type\": \"integer\"}},\n",
        "                \"required\": [\"temperature\"],\n",
        "            },\n",
        "        },\n",
        "    },\n",
        "]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "d9d3a3b1-52d8-4741-9a2a-a43e44180197"
      },
      "source": [
        "**Step 2**: First Turn - Get the Weather\n",
        "\n",
        "We send the initial prompt to the model. We expect it to call the `get_current_temperature` function and return a thought signature to maintain the context of the user's conditional request."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "92b2632b-c5d4-473c-a73e-41e0a32954e2"
      },
      "outputs": [],
      "source": [
        "prompt = \"\"\"\n",
        "If it's too hot or too cold in London, set the temperature to a comfortable level.\n",
        "Make your own reasonable assumption for what 'comfortable' means and do not ask for clarification.\n",
        "\"\"\"\n",
        "\n",
        "messages = [\n",
        "    {\"role\": \"user\", \"content\": prompt},\n",
        "]\n",
        "\n",
        "response1 = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=messages,\n",
        "    tools=tools,\n",
        "    extra_body={\n",
        "        \"extra_body\": {\n",
        "            \"google\": {\n",
        "                \"thinking_config\": {\n",
        "                    \"include_thoughts\": True,\n",
        "                },\n",
        "            },\n",
        "        },\n",
        "    },\n",
        ")\n",
        "\n",
        "print(response1.choices[0].message.tool_calls)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3a69330b-1a33-40e9-9903-4137d3a2c38b"
      },
      "source": [
        "**Step 3**: Second Turn - Set the Thermostat\n",
        "\n",
        "Now, we send the model's previous response and the result from our first tool call back to the model. The previous response from the model contains the thought signature, which will be passed back to the model. Using this context, the model will decide if it needs to call the `set_thermostat_temperature` function."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0149957a-4058-4449-b900-24a02a38e99a"
      },
      "outputs": [],
      "source": [
        "# Append user prompt and assistant response to messages\n",
        "messages.append(response1.choices[0].message)\n",
        "\n",
        "# Execute the tool\n",
        "tool_call_1 = response1.choices[0].message.tool_calls[0]\n",
        "result_1 = get_current_temperature(**json.loads(tool_call_1.function.arguments))\n",
        "\n",
        "# Append tool response to messages\n",
        "messages.append(\n",
        "    {\n",
        "        \"role\": \"tool\",\n",
        "        \"tool_call_id\": tool_call_1.id,\n",
        "        \"content\": json.dumps(result_1),\n",
        "    }\n",
        ")\n",
        "\n",
        "response2 = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=messages,\n",
        "    tools=tools,\n",
        "    extra_body={\n",
        "        \"extra_body\": {\n",
        "            \"google\": {\n",
        "                \"thinking_config\": {\n",
        "                    \"include_thoughts\": True,\n",
        "                },\n",
        "            },\n",
        "        },\n",
        "    },\n",
        ")\n",
        "\n",
        "print(response2.choices[0].message.tool_calls)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "f2f20b6f-3e3f-453c-b441-843a31a8314a"
      },
      "source": [
        "**Step 4**: Final Turn - Get user-facing response\n",
        "\n",
        "Finally, we send the conversation history, including the second function call and its result, back to the model to get a final, user-friendly text response."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "e6f244f9-9b10-489d-810f-386a3840382b"
      },
      "outputs": [],
      "source": [
        "# Append assistant response to messages\n",
        "messages.append(response2.choices[0].message)\n",
        "\n",
        "# Execute the tool\n",
        "tool_call_2 = response2.choices[0].message.tool_calls[0]\n",
        "result_2 = set_thermostat_temperature(**json.loads(tool_call_2.function.arguments))\n",
        "\n",
        "# Append tool response to messages\n",
        "messages.append(\n",
        "    {\n",
        "        \"role\": \"tool\",\n",
        "        \"tool_call_id\": tool_call_2.id,\n",
        "        \"content\": json.dumps(result_2),\n",
        "    }\n",
        ")\n",
        "\n",
        "response3 = client.chat.completions.create(\n",
        "    model=MODEL_ID,\n",
        "    messages=messages,\n",
        "    tools=tools,\n",
        "    extra_body={\n",
        "        \"extra_body\": {\n",
        "            \"google\": {\n",
        "                \"thinking_config\": {\n",
        "                    \"include_thoughts\": True,\n",
        "                },\n",
        "            },\n",
        "        },\n",
        "    },\n",
        ")\n",
        "\n",
        "Markdown(response3.choices[0].message.content)"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "name": "intro_chat_completions_api.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
